For many applications—especially those requiring diversified 
computations in real time or high reliability without massive 
redundancy—multiple microprocessor systems are the logical choice. 


Multiple Microprocessor Systems: 


What, Why, and When 


Eli T. Fathi and Moshe Krieger 
University of Ottawa 


Multiple microprocessor systems can provide an appropriate
solution to the demand for additional computing
power to meet new requirements and to support
complex applications. To clarify the concept and its associated
terminology, this article considers the “what,”
“why,” and “when” of multiple microprocessor systems.
Aspects that apply to all processors, regardless of
size, are presented in general terms. However, since our 
main interest lies with microprocessor-based systems, 
aspects that depend on processor power and input/out- 
put flexibility are related specifically to microprocessors. 

As the number of applications with more elaborate 
computational demands increases, we need to provide 
more processing power. This can be achieved at the pro- 
cessor level, by relying on technological improvements to 
push the microprocessor beyond its current maximum 
capabilities, or at the system level, by extending the capa- 
bilities of a single microprocessor through concurrent ex- 
ecution. 

Microprocessor technological improvements are made 
exclusively by the manufacturers, and even though user 
feedback may influence developments, they are beyond 
the control of the average user. There are definite limits 
to the extent and type of improvements possible with ex- 
isting technology. Thus, unless new technology is devel- 
oped, all technological improvements can be regarded as 
evolutionary rather than revolutionary. As such, they 
concentrate on the following areas: 


• Relatively simple applications. Provide more complex
single-chip systems by adding memory and
various I/O capabilities to the chip.

• Moderately complex applications. Increase microprocessor
capability and/or throughput by providing
longer words, more addressing capability, more
extensive and powerful instruction sets, and higher
operating speeds coupled with reduced power consumption.


March 1983 


• Complex applications. Facilitate modular increases
to system performance by introducing additional
control lines to implement multiple microprocessor
architectures and by adding various software support
functions to the chip.


The last point indicates a definite trend by the industry
to move toward multiple microprocessor systems by providing
the hardware and software support that facilitates
their design.

At the system level, performance can be enhanced by 
exploiting the concept of concurrent execution. This re- 
quires segmenting the process into tasks and using a real- 
time multitasking executive to schedule, control, and 
synchronize the various tasks. The result can be either 
apparent concurrency using a single microprocessor in a 
timeshared mode or true concurrency using a multiple 
microprocessor system. 

The first approach, useful only in small special- 
purpose systems, seeks maximum resource utilization 
from a single microprocessor. Because microprocessors 
are physically limited, this method is more appropriate 
for minicomputers or mainframes. The second method, 
however, attempts complete utilization of the system, 
not of individual microprocessors. In that context, a 
system is balanced whenever the work load can be evenly 
distributed among system elements. Distribution of the 
work load is referred to as load sharing and can be either 
dynamic (accomplished during system operation) or sta- 
tic (fixed during the design phase). With the addition of 
more processors for a multiple microprocessor system, 
the system can be regarded as being independent of the 
microprocessor’s physical characteristics. Thus, theo- 
retically, this method provides unlimited room for ex- 
pansion and improvement. 
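Static load sharing, as described above, fixes the assignment of tasks to processors during the design phase. The following is a minimal Python sketch of one way this could be done, assuming each task's load is known in advance and using a greedy heaviest-task-first heuristic. The function name `static_assignment` and the load figures are hypothetical; the article does not prescribe a particular balancing algorithm.

```python
import heapq

def static_assignment(task_loads, n_processors):
    """Assign tasks to processors at design time (static load sharing).

    Greedy heuristic: take tasks from heaviest to lightest and give each
    one to the currently least-loaded processor.
    Returns (assignment, loads): assignment[p] lists the task indices
    placed on processor p, and loads[p] is that processor's total load.
    """
    heap = [(0, p) for p in range(n_processors)]  # (current load, processor)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_processors)]
    order = sorted(range(len(task_loads)),
                   key=lambda t: task_loads[t], reverse=True)
    for t in order:
        load, p = heapq.heappop(heap)          # least-loaded processor
        assignment[p].append(t)
        heapq.heappush(heap, (load + task_loads[t], p))
    loads = [0] * n_processors
    for load, p in heap:
        loads[p] = load
    return assignment, loads

# Hypothetical task loads, in arbitrary units, shared among three processors.
assignment, loads = static_assignment([5, 4, 3, 3, 2, 1], 3)
print(assignment, loads)
```

A dynamic scheme would run a similar decision during system operation, using measured rather than estimated loads.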

The idea of using more than one processing element to 
improve system performance preceded the development 
of microprocessors, but that technology now permits the 


0018-9162/83/0300-0023$01.00 © 1983 IEEE




use of computing power in a wide range of applications 
that had been impractical because of the computer’s cost 
and physical size. The use of multiple microprocessor 
systems extended the range and capabilities of single 
microprocessors to more complex areas previously in the 
domain of large computers and has led to other system 
enhancements—for example, improved reliability and 
ease of design. 


What are multiple microprocessor systems? 


Logical structure. To provide harmonious operation, 
the logical relations among the various elements of a 
multiple processor system must be well defined. Here, 
logical structure refers to the way the control responsibility
is distributed among the system elements. The two
most obvious relations are vertical and horizontal. In a 
vertical system, elements are hierarchically structured, 
implying a master-slave relation; in a horizontal system, 
the elements are logically equal, implying a master- 
master relation. 


Vertical organization. In its simplest form, a vertical 
organization has a single master with multiple slaves and 
has the following main characteristics: 


• Not all elements are logically equal.

• At any given time only one element can act as a
master; however, several elements may have the
potential of becoming a master.

• All interprocessor communications must go through
the master or be initiated by the master.

• The slaves’ hardware may be identical, with
customizing to a special task done via the software.


In vertical organizations, number crunching is usually 
done by the master processor (generally the most power- 
ful) and I/O processing by slave processors, thus achieving
very high throughput.¹ Systems may also contain
more than one level of master-slave arrangement, thus 
forming a pyramid configuration. 


Horizontal organization. Horizontally organized 
systems require more sophisticated coordination. They 
have the following main characteristics: 


• All elements are logically equal.

• Coordination can be done with or without a floating
controller.

• Any element can communicate with any other element
in the system.


In general, horizontal systems are more flexible than 
vertical systems and are capable of dynamic load sharing. 
However, they are not efficient for applications having 
many vastly different tasks. 

Except in the case of the newest high-performance 
microprocessors, the large processing overhead required 
to coordinate horizontal systems precludes their effectiveness
for multiple microprocessor systems. Vertical
organizations are more appropriate.


Physical structure. The physical structure of a multiple 
processor system refers to the method of information ex- 
change and is a function of the interprocessor communi- 
cation arrangement and the interconnection topology. 


Interprocessor communication arrangement. Data 
transfers between processors can be carried out either via 
a common memory structure or via a bus structure, often 
referred to as centralized and distributed structures, re- 
spectively.” In the common memory structure, all data 
transfers are via the common memory, and elements 
have no direct access to each other. In the system bus 
structure, a logical link established on the bus structure 
creates a communication path between elements; in the 
most general case, data transfers are initiated and per- 
formed in a distributed fashion. 

In systems with frequent and/or large data transfers, 
the above extreme arrangements are not efficient due to 
increased contention for the shared resource. This prob- 
lem is further aggravated in microprocessor systems be- 
cause of the memory-processor bottleneck and limited 
I/O capability. 


Interconnection topology. We must also consider the 
way in which the elements are connected, that is, the in- 
terconnection topology. Physically, there are many ways 
of interconnecting N elements in a system, but in 
establishing the interconnection scheme, reliability and 
expandability are important factors. A reliable intercon- 
nection scheme provides an alternate path in case a link, 
a direct path between two elements, fails. An expandable 
interconnection scheme facilitates the addition of more 
elements without affecting the existing structure. The 
four most basic interconnection schemes are common 
bus, star, ring, and fully connected (see Figure 1). Other 
topologies are basically combinations or variations of 
these schemes. 
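To make the reliability criterion concrete, the sketch below builds the link lists for the star, ring, and fully connected schemes and checks whether every element can still reach every other after any single link fails. It is an illustration only: the function names are hypothetical, the star's hub is modeled as an extra node standing for the central switch, and the common bus is left out because it is a single shared medium rather than a set of point-to-point links.

```python
from itertools import combinations

def ring(n):
    """Links of a ring of n processors numbered 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

def star(n):
    """Links of a star: n processors attached to a hub node 'S'."""
    return [("S", i) for i in range(n)]

def fully_connected(n):
    """Links of a fully connected system: one link per processor pair."""
    return list(combinations(range(n), 2))

def connected(nodes, links):
    """True if every node can reach every other node over the links."""
    nodes = set(nodes)
    neighbours = {v: set() for v in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in neighbours[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes

def tolerates_single_link_failure(links):
    """True if the system stays connected when any one link is removed."""
    nodes = {v for link in links for v in link}
    return all(connected(nodes, links[:i] + links[i + 1:])
               for i in range(len(links)))

n = 5
for name, links in [("star", star(n)), ("ring", ring(n)),
                    ("fully connected", fully_connected(n))]:
    print(name, len(links), tolerates_single_link_failure(links))
```

For five processors the star and the ring each need 5 links and the fully connected scheme needs 10, that is, N(N − 1)/2. Of the three, only the star loses an element when a single link fails, since each processor has just one path to the hub.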

The data transfer mode and interconnection topology 
are the most basic elements of the intercommunication 
system. But in designing multiple processor systems, we 
must also decide on direct or indirect data transfer be- 
tween elements, centralized or decentralized control, 
etc. 


Mode of interaction. The two most prominent tax- 
onomies classify computer systems according to mode of 
interaction and mode of processing. Using mode of in- 
teraction as the basis, systems can be classified by the 
degree of coupling and the nature of the intercommuni- 
cation between processors. Coupling refers to the ability 
of the various elements to share resources, with the two 
extremes being loosely coupled and tightly coupled 
systems. 


Loosely coupled systems. Loosely coupled multiple 
processor organizations, also known as computer net- 
works,‘ have the following characteristics: 


• Autonomous computers. The system contains a
number of independent computer systems that can
be geographically dispersed.


COMPUTER 


• Communication interface. The various computers
in the system are interconnected via a communication
interface.

• Communication protocol. Intercomputer communication
follows a rigid communication protocol.

• Serial communication. The intercomputer communication
links are generally high-speed serial
lines.

• On-site computation. In general, the network is used
only for communication. Actual computing is done
at a single site.

• Computer accessibility. A user at any site can use the
computing facilities at all other sites.


The best known and largest computer network is the 
Advanced Research Projects Agency Network, which 
connects over 50 major computing facilities across the 
United States.° 


Currently, computer network organizations, with 
their rigid interprocessor communication requirements 
and extensive nodal computing requirements, are not 
directly applicable to microprocessor-based systems. 
However, with further increases in applications that re- 
quire distributed computing and enhancements in micro- 
processor technology, it is likely that modified versions 
of computer network organizations will be used soon. 


Tightly coupled systems. Also known as multiproces- 
sor systems,° tightly coupled multiple processor systems 
have the following characteristics: 


• Common memory. A primary memory can be accessed
by all processors in the system. In addition,
each processor may have a separate data memory.

• Common operating system. A single common operating
system controls and coordinates all interactions
between processors and processes.

• Shared resources. I/O facilities and other system
resources are generally shared among the processors.
However, some resources may be dedicated
to specific processors.

• Equal processing power. General-purpose processors
are symmetrically configured and exhibit
similar capabilities.

• Dynamic load sharing. Dynamically distributing the
load of an overloaded processor permits uniform
load sharing across all processors.

• Processor autonomy. Each of the cooperating processors
can execute significant computations individually.

• Synchronization. Synchronization between cooperating
processors is needed.


Figure 1. Interconnection topologies: (a) common bus, (b) star, (c) ring, and (d) fully connected.




The major limitation of a tightly coupled multiple processor
organization is the possibility of primary memory
access conflicts. This restriction tends to put an upper
bound on the number of processors that can be effectively
supported by a single operating system. Most
processor-memory switching structures attempt to reduce
the number of main memory access conflicts. The
three most fundamental processor-memory organiza- 
tions are (1) the common bus, where all system elements 
are connected to a single bus; (2) the crossbar switch, 
where the elements are connected to a separate module, 
called the crossbar switch, which can provide several 
simultaneous connections between elements; and (3) the 
multiport memory, where each memory element has 
more than one access port and is connected to the other 
elements via a multibus system. (Further details can be 
found in Enslow.⁷)

Basically a collection of low-speed processors performing
the work of a single high-speed processor, the
multiprocessor organization has been used in the implementation
of various computer systems.⁸ However, the
memory-processor bottleneck, characteristic of microprocessors,
limits their applicability to multiprocessor
organizations. Multiprocessor systems using microprocessors
can be implemented in special cases where a large
number of similar, relatively independent processes exchange
only small amounts of data; one example of a
system that utilizes microprocessors is Stanford University’s
Minerva.⁹


Distributed microprocessor systems. Tightly coupled 
and loosely coupled structures are the two extremes of 
multiple processor organization. Other structures that 
combine the better qualities of each are more suitable for 
microprocessor-based systems. These moderately coupled
microprocessor-based designs are known as multimicrocomputer
systems¹ and are often called distributed
intelligence microcomputer systems, or DIMSs.¹⁰

In a distributed microprocessor system, the general 
work load is partitioned into relatively independent 
tasks, which can then be assigned to various system 
elements. Such a system has the following main 
characteristics. 


• Autonomous elements. Each individual element
generally consists of a CPU, local program, and
data memory, and may use or control additional
peripherals.

• Processors dedicated to a task. Ideally, each element
is dedicated to a specific task that determines its
relative complexity.

• Processors with varied complexity. System configuration
is not necessarily symmetrical since its
elements vary in complexity.

• Software and hardware optimization. Each element’s
hardware and software is tailored to the
specific task it performs.

• Data level communication. Interprocessor communication
is generally at the data level. However,
in certain situations data may contain commands or
include responses to specific requests.

• Separate application and communication processors.
In general, each element handles both I/O
control and system communication. In the case of
heavy communication activity, one of these functions
may be delegated to another processor.¹¹

• Static load sharing. Because the processors are dedicated,
a minimal system cannot support dynamic
load sharing; thus, proper load balancing must be
done during the design phase. However, some load
sharing can be introduced by including additional
units.


Mode of processing. One of the earliest classifications 
of computer systems, introduced by M. J. Flynn when he 
was considering speed-up techniques, is based on instruction
and data flow.¹² Flynn defined four classifications:


• Single-instruction single-data stream. An SISD
machine is the classical von Neumann computer that
executes instructions sequentially, one at a time.

• Multiple-instruction single-data stream. There is
some argument as to the type of computer included
in the MISD class.¹³ One candidate is the variety of
pipeline processor that segments computations into
consecutive stations.

• Single-instruction multiple-data stream. Vector, array,
and associative processors belong in the SIMD
class. They generally have a single, central control
unit that fetches and decodes the instructions and
then broadcasts control to the processing elements.

• Multiple-instruction multiple-data stream. The
MIMD class, the most general one, can have different
processors, each with its own control unit.
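The four classes follow mechanically from two questions: does the machine have more than one instruction stream, and does it have more than one data stream? A minimal Python sketch of that mapping is given below; the function name `flynn_class` and the example stream counts are hypothetical.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return Flynn's class for the given numbers of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "M" if instruction_streams > 1 else "S"
    data = "M" if data_streams > 1 else "S"
    return f"{instr}I{data}D"

print(flynn_class(1, 1))   # classical von Neumann machine
print(flynn_class(1, 64))  # one control unit driving 64 processing elements
print(flynn_class(4, 4))   # four processors, each with its own control unit
```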


Although Flynn’s classification has been accepted as 
the most basic one, it considers execution only at the in- 
struction level and is much too restrictive. To include 
more recent organizations, a number of modifications 
and extensions to Flynn’s taxonomy have been suggested 
in the literature.⁸,¹⁴,¹⁵ In its most general form, a multiple
processor system capable of concurrently executing a
number of tasks, each utilizing different sets of data, can
be called a multiple-task multiple-data, or MTMD,
system.¹⁶

In terms of multiple microprocessors, SISD machines 
correspond to multi-ALU systems or uniprocessors con- 
sisting of bit-sliced microprocessors; MISD and SIMD 
machines correspond to multiple microprocessor systems 
used to implement a single special-purpose CPU; and 
MIMD machines correspond to multiple microprocessor 
systems used to implement distributed processing 
systems. The latter are mostly MTMD systems in which 
each microprocessor also has local memory. 

Since mode of processing and mode of interaction are 
closely related, the classifications based on those factors 
can be integrated as shown in Figure 2. 


Why a multiple processor system? 


To properly evaluate multiple processor systems, one 
must postulate a number of system performance measures
related to processing capability, reliability, and
design and development. Certain properties of these 




Figure 2. Classification tree of multiple processor organizations.


measures are strongly related and, although included in 
only one specific category below, one may influence 
others. 


System processing capabilities. Measures related to 
system processing capabilities include cost-performance, 
throughput, and resource sharing. 


Cost-performance. On a system level, it is generally accepted
that processor performance increases with cost;
the question is “how?” In a uniprocessor system, processor
performance increases more rapidly than cost. At
one time, Grosch’s Law¹⁷ suggested that processor performance
was proportional to the square of its cost.
Thus, using more than one processor solely to obtain 
more raw processing power was not economical. How- 
ever, with advances in microprocessor technology, we 
are approaching the era of ‘‘no cost’’ processing power 
relative to the cost of other system parts and can now 
develop an incremental cost/performance curve that is 
more linear, as indicated in Figure 3. Thus, it is becoming 
economically attractive to use additional processors to 
increase system performance. Theoretically, the op- 
timum performance of a multiple processor system 
equals the sum of the optimum performance of each pro- 
cessor. In practice, it will be less. 
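A short worked comparison shows why Grosch's Law argued against multiple processors and why a linear curve removes the objection. The symbols are introduced here for illustration and are not the article's: $P$ is performance, $C$ is cost, and $k$ and $k'$ are constants of proportionality.

```latex
% Grosch's Law: performance grows as the square of cost.
P(C) = k\,C^{2}
% One processor costing 2C versus two processors costing C each:
P(2C) = 4k\,C^{2} \qquad\text{versus}\qquad 2\,P(C) = 2k\,C^{2}
% Linear curve: performance proportional to cost.
P(C) = k'\,C \quad\Longrightarrow\quad P(2C) = 2k'\,C = 2\,P(C)
```

Under the square law the same budget buys twice the performance when spent on a single larger processor. Under the linear curve the two choices are equal in the ideal case, so the other benefits of multiple processors can decide the matter.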

We should, of course, consider other costs associated 
with multiple processor systems. Even if we do not con- 
sider high software costs, it is obvious that the prices of 
many hardware elements—boards and connectors, for 
example—do not decrease as rapidly as those of micro- 
processors and memories. Therefore, the reduction in 
the incremental cost/performance ratio may be offset by 
other system costs. 


Throughput. System throughput, defined as the recip- 
rocal of the time required to execute a given set of algo- 
rithms, is an appropriate indicator of system perfor- 
mance. It is measured in number of operations per unit 
time. 

Ideally, throughput increases in proportion to the 
number of new processors added to a multiple processor 




Figure 3. Performance of a system as a function of cost.

Figure 4. The saturation effect.


system. In practice, due to the saturation effect, the rela- 
tionship between throughput and the number of pro- 
cessors resembles the curve shown in Figure 4. 

The saturation effect¹⁸ is defined as the degradation in
throughput for incremental increases in the number of 




computing elements. Initially, system throughput 
follows the ideal linear curve, but as more processors are 
added to the system, throughput levels off and finally 
decreases. This effect can be attributed to the fact that as 
the number of processors increases, so does the amount 
of contention on the shared resources. There is also an 
increase in the amount of overhead information required 
for proper interprocessor communication.

The leveling off and decay points may be different for 
various multiple processor organizations. Ideally, we 
would like to operate on the linear portion of the curve, 
which can be extended by reducing the amount of con- 
tention and interprocessor communication activity. This 
can be achieved by partitioning the main job into smaller 
independent tasks that require little intercommunica- 
tion. 
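The shape of the curve can be reproduced with a simple contention model. The sketch below is an assumption made for illustration, not a model taken from this article: it uses a formula of the kind known as the universal scalability law, with a hypothetical coefficient `alpha` for contention on shared resources and a hypothetical coefficient `beta` for communication overhead that grows with the number of processor pairs.

```python
def relative_throughput(n, alpha=0.05, beta=0.01):
    """Throughput of n processors relative to a single processor.

    alpha: contention for shared resources (assumed value).
    beta:  communication overhead per processor pair (assumed value).
    With alpha = beta = 0 the result is the ideal linear curve, n.
    """
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

curve = {n: relative_throughput(n) for n in range(1, 21)}
best_n = max(curve, key=curve.get)  # processor count with peak throughput
for n in (1, 2, 4, 8, best_n, 20):
    print(f"{n:2d} processors: {curve[n]:.2f}")
```

With these assumed coefficients the curve peaks at ten processors and declines beyond that. Reducing either coefficient, for instance by partitioning the job into tasks that need little intercommunication, moves the peak toward larger processor counts.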


Resource sharing. A general characteristic of most 
multiple processor systems, resource sharing is influ- 
enced mainly by financial considerations. In general, 
when the utilization factor of a given resource is low, it
makes sense to timeshare the resource among the pro- 
cessing elements in the system. 

Timesharing of resources must be done in an orderly 
manner to avoid resource overload and to maintain con- 
flict-free systems. The types of resources that may be 
shared in a multiple processor system range from a dumb 
printer or memory module to a sophisticated high-speed 
arithmetic processor. 


System reliability measures. The classical definition of 
reliability is the conditional probability that a system will 
survive interval (0, t), given that it was operational at
time t = 0. A measure of reliability is mean time before
failure (MTBF).

For multiple processor systems, where execution of a 
given job may require the cooperation of several pro- 
cessors, some of which may operate under different ex- 
ternal conditions, the classical definition of reliability is 
too general. For these systems, reliability is more ap- 
propriately defined as the probability of executing a 
given task under a given condition for a specified time.¹⁹

For reliability analysis, most systems can be classified 
as redundant or nonredundant systems.²⁰ In nonredundant
systems, each component must function properly
for the system to work. In redundant systems, duplica- 
tion of part, or all, of the system ensures at least limited 
operation in the event of a failure. To obtain the reliabili- 
ty model of a redundant system, both the reliability of 
each system module and the model of the fault-tolerant 
scheme that is used must be determined.

Since most multiple processor organizations are in- 
herently redundant, they can be classified as redundant 
systems. In addition, their reliability can be improved by 
duplicating both physical elements and processor tasks. 
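The distinction can be made concrete with the standard textbook formulas for independent failures, which this article does not spell out: a nonredundant system behaves as a series arrangement (every module must work), while full duplication behaves as a parallel arrangement (at least one copy must work). A minimal Python sketch follows, with hypothetical function names and assumed module reliabilities. It ignores the model of the fault-tolerant scheme itself, which a complete analysis would also need.

```python
from math import prod

def nonredundant_reliability(module_reliabilities):
    """Every module must work: product of the module reliabilities."""
    return prod(module_reliabilities)

def redundant_reliability(copy_reliabilities):
    """At least one copy must work: 1 minus the chance that all copies fail."""
    return 1 - prod(1 - r for r in copy_reliabilities)

modules = [0.95, 0.95, 0.95]                          # assumed per-module reliabilities
single = nonredundant_reliability(modules)            # one nonredundant system
duplicated = redundant_reliability([single, single])  # two copies, either suffices
print(round(single, 4), round(duplicated, 4))
```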


Fault tolerance. A fault-tolerant system is capable of 
overcoming hardware malfunctions and/or software er- 
rors without human intervention, thus extending overall 
system MTBF beyond that of individual elements. To 
achieve this, the system may employ either massive or 
selective redundancy.²¹


Systems with massive redundancy use identical units 
operating simultaneously to protect against failures
and/or errors. The effects of the faulty unit are logically 
masked out by the remaining, properly operating units. 
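Masking can be pictured as a majority vote over the outputs of the replicated units. The following Python sketch assumes that all properly operating units produce the same value; the function name `majority_vote` and the sample outputs are hypothetical.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of the units.

    Raises ValueError if there are no outputs or if no value has a
    strict majority, in which case the fault cannot be masked.
    """
    if not outputs:
        raise ValueError("no outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority: fault cannot be masked")
    return value

# Three identical units; the third one is faulty and is outvoted.
print(majority_vote([42, 42, 17]))
```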

Systems with selective redundancy employ real-time 
recovery procedures to automatically switch from a faul- 
ty unit to a standby unit. Successful initiation of the 
recovery procedures involves fault detection, fault containment,
and fault diagnosis.²²

Hardware faults and software errors must be detected 
as soon as possible. During the detection latency period 
(the time between fault occurrence and fault detection), 
the fault containment unit must prevent fault-damaged 
data from propagating through the system and contami- 
nating it. After detection, diagnostic programs deter- 
mine the extent of the failure and localize the faulty part. 
Subsequently, the part is logically isolated, and recovery 
procedures are initiated to restore overall system opera- 
tion while maintaining data integrity. 

In multiple processor systems, fault-tolerance is ob- 
tained by transferring job responsibilities. If the system 
design includes a limited number of standby elements, 
the functions of the faulty element will be assigned to one 
of them. Otherwise, the remaining, properly functioning 
elements will be asked to accept additional assignments. 
Fail-safe operation is obtained if transfer of responsibili- 
ty does not affect system performance. On the other 
hand, if the system maintains only reduced-capacity op- 
eration because of partial transfer and/or general slow- 
down, it is in a fail-soft mode. In the fail-soft mode, some 
computing power is traded for continuous operation. 
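The transfer of responsibilities can be sketched as follows. This is a simplified illustration with hypothetical names and data structures: each element has a task list, all elements are assumed to share one capacity limit, a failed element's tasks go to a standby if one is free, and otherwise they are spread over the least-loaded surviving elements. The mode is reported as fail-soft when any element ends up over capacity, which stands in for the reduced-capacity operation described above.

```python
def transfer_tasks(tasks, capacity, failed, standbys):
    """Reassign the tasks of a failed element.

    tasks:    dict mapping element name -> list of task names (modified in place)
    capacity: tasks an element can run without slowdown (assumed equal for all)
    failed:   name of the faulty element
    standbys: list of idle standby elements (modified in place)
    Returns "fail-safe" if no element exceeds capacity afterwards,
    otherwise "fail-soft".
    """
    orphaned = tasks.pop(failed)
    if standbys:
        tasks[standbys.pop(0)] = orphaned   # a standby takes over completely
    else:
        for task in orphaned:               # spread over the surviving elements
            least_loaded = min(tasks, key=lambda e: len(tasks[e]))
            tasks[least_loaded].append(task)
    overloaded = any(len(t) > capacity for t in tasks.values())
    return "fail-soft" if overloaded else "fail-safe"

system = {"P1": ["io", "log"], "P2": ["control"], "P3": ["display", "alarm"]}
print(transfer_tasks(system, capacity=2, failed="P2", standbys=["S1"]))
print(transfer_tasks(system, capacity=2, failed="P3", standbys=[]))
```

In the first call a standby absorbs the failed element's work, so performance is unaffected. In the second call no standby remains, one survivor is pushed past its capacity, and the system continues in a fail-soft mode.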




Flexibility. The flexibility measure is the ease of
reconfiguring system topology and reallocating job
responsibilities among other system elements. This characteristic
is necessary to facilitate system recovery procedures
in the case of element failure.

Providing uninterrupted operation at full or reduced
capacity requires real-time flexibility, that is, system
reconfiguration in real time under program control.
Multiple processor systems achieve flexibility mainly 
through software, but the hardware must support it. The 
introduction of flexibility generally implies the use of 
more complex hardware and software, which—if done 
improperly—can reduce individual element reliability. 

System flexibility can facilitate future system expan- 
sion and is closely related to the degree of system 
modularity. 


Serviceability. Both maintainability and repairability 
are aspects of serviceability. Maintainability is concerned 
with preventive maintenance—for example, continuous 
running of diagnostic routines and periodic maintenance 
checks. Repairability is the ease of detecting and locating 
hardware failures and/or software errors once the system 
is down. A measure of maintainability is mean time to 




repair (MTTR), defined as the sum of the expected mean time for
periodic maintenance and the expected time for repair
after failure.²³

Since serviceability is concerned with repairability, it 
complements the fault-tolerance property. Both guard 
against total system failure—fault-tolerance while the 
system is up, and serviceability while it is either up or par- 
tially or completely down. Fault-tolerance is introduced 
during the design phase as a logical function to mask out 
the effects of a faulty unit. Serviceability, on the other 
hand, is mainly physical in nature; it is concerned with 
the physical repair of a faulty part and the prevention of 
future failures. 

Multiple processor systems generally facilitate service- 
ability since total system complexity is broken down into 
simpler subsystems. The reduction in subsystem com- 
plexity implies that testing and repair will be easier and, 
occasionally, may be partially carried out while the 
system is operating. In some multiple processor systems, 
processing elements may have very similar or even identical
hardware, thus eliminating the need for large spare-part
inventories and, since the faulty unit can be replaced
by a spare, maintaining a low MTTR.


Availability. Availability is a figure of merit describing 
system availability to users, that is, the probability that 
the system will be operational at time t. It can be expressed
as the percentage of time the system is up (available)
and is given by


$$\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$$

Multiple processor systems offer good availability 
because of higher MTBF figures achieved by higher reli- 
ability and lower MTTR figures obtained by improved 
serviceability. 
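As a quick numerical check of the relation between MTBF, MTTR, and availability, the sketch below computes availability for two sets of assumed figures. The function name and the hour values are hypothetical and are not measurements from any system discussed here.

```python
def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)

baseline = availability(1000, 10)  # assumed figures for a baseline system
improved = availability(2000, 5)   # assumed: higher MTBF and lower MTTR
print(f"{baseline:.4f} {improved:.4f}")
```

Raising MTBF and lowering MTTR both push availability toward 1, which is the combined effect the paragraph above attributes to multiple processor systems.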


System design and development measures. In complex 
applications, it is desirable to partition the main job into 
smaller tasks to minimize interdependency between 
tasks. Ideally, each task would be assigned to a dedicated 
processor, thereby limiting interprocessor communica- 
tion to the data level. A properly partitioned multiple 
processor system improves performance in terms of 
system deployment, modularity, and prolonged life cycle;
it also facilitates human engineering.


System deployment. A multiple processor system has a 
greater potential for use, even if it is only partially im- 
plemented, than a single processor system. Furthermore, 
the time required to become operational is less for a 
multiple processor system since the development, im- 
plementation, and installation phases can overlap. Once 
a functionally independent subsystem is developed, it 
can be implemented, installed, and used. Subsystems can 
be added to the existing section until the system is com- 
plete and operating at full capacity. 


Modularity. System modularity can be defined in 
terms of the compactness and isolation of all its 
elements. Modular designs, which feature independent, 




less complex hardware and software modules, generally 
shorten development and debug time and facilitate ser- 
viceability. 

Modular systems are also more responsive in that each 
subsystem’s software and hardware can be optimized for 
a specific task. This permits a faster, more efficient 
response. For example, consider a subsystem that con- 
trols a process requiring rapid responses to interrupts. If 
one processor is not sufficient, more processors can be
added to optimize response time. 

System modularity can improve system versatility. The
association of software functions with specific hardware
modules, if done properly, makes it possible to control the
software configuration within the different end-user systems,
thus making the system more versatile.


Prolonged life cycle. System modularity also facilitates 
system enhancement. Traditionally, systems designed 
around a single processor had to be replaced once op- 
timal performance limits were reached. This implied a 
major financial investment, often beyond the means of 
an average user. However, a modular multiple-processor 
system can be upgraded to meet new requirements at 
minimal cost, thus prolonging system life. 

Enhancement of an existing system may be desirable to 
eliminate a bottleneck, add more features, and/or im- 
prove performance in terms of speed and power. Also, 
modularity implemented with multiple processor archi- 
tecture allows fine-tuning of system operation. One por- 
tion can be modified without affecting the rest of the 
system. 


Human engineering. Many computer-controlled 
systems are being developed for applications involving 
nonprofessional users. Such systems are generally highly 
interactive to accommodate users who have relatively lit- 
tle or no experience in computer technology, and they 
use human engineering concepts to provide simple 
man/machine interfaces that can be easily understood. 
This requirement can be achieved most economically by 
using a local, dedicated processor for the interface. 


When to use multiple microprocessor 
systems 


The application of a multiple microprocessor system is 
closely related to the capabilities of the various pro- 
cessors used in the system. In general, a microprocessor 
has neither the computational power nor the communi- 
cation flexibility of a larger computer, due to technolog- 
ical constraints and pin limitations, respectively. How- 
ever, the microprocessor—a versatile, low-cost source of 
computing power—has made digital processing practical 
and financially attractive for many new applications. 


Design considerations. Basic design considerations 
must be examined before we can define when to use 
multiple microprocessor systems. Whenever complex 
systems are considered, overall system operation must be 
decomposed into a number of relatively independent 
tasks; however, the problem of task partitioning in
distributed data processing systems will not be considered
here, since it is highly application oriented and has been
extensively discussed in the literature.24-26

In multiple microprocessor systems, tasks are initiated 
asynchronously by external stimuli and/or internally 
within the system, and they are executed by a number of 
cooperating elements. This may introduce execution
difficulties in terms of concurrency and conflict.27 However,
the problem of concurrency and conflict can be
dealt with independently of physical structure by using 
task, event, and communication concepts as logical 
equivalents of physical structures.28 To obtain har- 
monious operation, the system must be capable of the 
following control aspects: arbitration, allocation, and 
coordination. 


Arbitration. Efficient task/resource allocation de- 
pends on the resolution of all potential contention 
through identification of events/requests and verification
of the status of requested tasks/resources. The real-time
demands for tasks/resources are physical entities in
the form of asynchronous requests/interrupts. These 
demands can be represented by events which are their 
logical equivalents. Thus, the system software can treat 
all demands the same, regardless of their physical 
characteristics. 

Arbitration procedures can be carried out centrally by 
a single self-contained unit or in a decentralized fashion 
by small, dedicated units distributed among the various 
system elements.29 The most commonly used arbitration
control schemes are daisy chain, polling, and asynchron- 
ous requests/interrupts. 
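
The three arbitration schemes named above can be sketched in software as follows. This is only an illustrative model, not a description of any particular hardware: the function and class names are hypothetical, the polling order is assumed to be round-robin, and asynchronous requests are assumed to be served in arrival order, neither of which is specified in the text.

```python
from collections import deque


def daisy_chain_grant(requests):
    """Daisy chain: the grant travels down the chain and is absorbed by
    the first requesting unit, so position 0 has the highest priority.

    `requests` is a list of booleans ordered by position on the chain.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None  # no unit requested the resource


class PollingArbiter:
    """Polling: the arbiter asks each unit in turn (assumed round-robin),
    starting with the unit after the one granted last."""

    def __init__(self, unit_count):
        self.unit_count = unit_count
        self.last_granted = unit_count - 1  # so the first poll starts at unit 0

    def grant(self, requests):
        for offset in range(1, self.unit_count + 1):
            unit = (self.last_granted + offset) % self.unit_count
            if requests[unit]:
                self.last_granted = unit
                return unit
        return None


class RequestQueueArbiter:
    """Asynchronous requests: units raise requests at any time and are
    served in arrival order (an assumed policy)."""

    def __init__(self):
        self.pending = deque()

    def request(self, unit):
        if unit not in self.pending:  # ignore a repeated request from the same unit
            self.pending.append(unit)

    def grant(self):
        return self.pending.popleft() if self.pending else None
```

In this model the daisy chain always favors the unit nearest the arbiter, whereas the polling arbiter rotates the starting point and so gives every unit a turn.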


Task/resource allocation. Apart from the resolution
of contention and subsequent arbitration, system re- 
sources must be properly allocated. A shared resource 
can be thought of as any system part, hardware or soft- 
ware, that could be used in more than one process, where 
a process is a logical entity related to the execution of any 
well-defined procedure. In multi-microcomputer sys- 
tems, a process is the logical grouping of one or more 
tasks with their associated data, where each task repre- 
sents the smallest independent entity of a procedure. As 
outlined previously, a multi-microcomputer system con- 
sists of a number of cooperating task execution units. In 
other words, process execution may involve assignment 
of a number of units. Thus, resource and task allocation 
are synonymous concepts in these systems. The allocation
method can be either static or dynamic.30
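
The difference between static and dynamic allocation can be illustrated with a minimal sketch. The task names, unit names, and both functions are hypothetical; the sketch assumes that a static binding is a fixed table and that a dynamic binding picks any idle unit able to run the task at request time.

```python
def static_allocate(task, assignment_table):
    """Static allocation: the task-to-unit binding is fixed at design time."""
    return assignment_table[task]


def dynamic_allocate(task, capable_units, busy_units):
    """Dynamic allocation: choose, at request time, the first idle unit
    that is able to execute the task."""
    for unit in capable_units.get(task, []):
        if unit not in busy_units:
            return unit
    return None  # every capable unit is busy, so the request must wait


# Example use with hypothetical names.
table = {"sample_adc": "P1", "update_display": "P2"}
capable = {"filter": ["P1", "P3"]}
chosen_static = static_allocate("sample_adc", table)
chosen_dynamic = dynamic_allocate("filter", capable, busy_units={"P1"})
```

Static allocation needs no run-time decision, while dynamic allocation can route around a busy unit at the cost of tracking unit status.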


Task/resource interaction and coordination. In a 
multiple microprocessor system, some sort of interaction 
and coordination among several concurrent tasks is nec- 
essary. Thus, we can talk about intertask/process com- 
munication procedures instead of communication be- 
tween physical entities. This provides a uniform descrip- 
tion of all system elements and eliminates concern with 
actual hardware characteristics. Also, providing the pro- 
per task/process communication procedures at a logical 
level (before implementation) permits a more efficient 
definition of actual physical communication procedures.
The coordination problems associated with concurrent
processes can be stated in terms of determinacy,
synchronization, deadlock, and mutual exclusion.6
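
Two of these coordination problems, mutual exclusion and synchronization, can be shown at the logical level with a short sketch. Python threads stand in for the cooperating processors here, and the Mailbox class and run_tasks function are hypothetical names introduced only for illustration. Deadlock cannot arise in this sketch because only one lock is ever held at a time.

```python
import threading


class Mailbox:
    """A shared message area guarded by a lock (mutual exclusion)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._messages = []

    def post(self, sender, text):
        with self._lock:  # only one task may update the mailbox at a time
            self._messages.append((sender, text))

    def count(self):
        with self._lock:
            return len(self._messages)


def run_tasks(task_count, posts_per_task):
    """Run several concurrent tasks that all post to one mailbox."""
    mailbox = Mailbox()

    def task(task_id):
        for n in range(posts_per_task):
            mailbox.post(task_id, f"message {n}")

    workers = [threading.Thread(target=task, args=(i,)) for i in range(task_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()  # synchronization point: wait for every task to finish
    return mailbox.count()
```

Because every update passes through the lock, the final message count is the same on every run regardless of how the tasks interleave, which is the determinacy property mentioned above.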


System executive. All the above control aspects are in 
the domain of the system executive. The implementation 
of an effective multi-microcomputer system is greatly in- 
fluenced, if not determined, by the proper design of its 
executive. The design of a multiple microprocessor executive
is very complex31,32 and beyond the scope of this
article; however, ideas used in the design of real-time
executives for uniprocessor systems are applicable.33-36


Application characteristics. Multiple microprocessor
system implementation should be considered for


• Elaborate process-control applications with diversified
  computational demands and real-time constraints.
  For most of these applications, the heavy
  processing requirements far exceed the capabilities
  of a single microprocessor-based system.

• Applications with extensive I/O processing. The
  need to interface with a large variety of I/O processes
  generally imposes unacceptable control
  overhead on a single microprocessor and causes
  severe degradation in system responsiveness.

• Applications that demand high reliability but, due
  to financial and/or space constraints, cannot support
  massive redundancy.


Avoid using multiple microprocessor systems if either or
both of the following conditions exist:


• All that is needed is more raw processing power. In
  this case, it is advantageous to get a more powerful
  CPU or to add one or more dedicated, specialized
  processors.

• The global task to be executed cannot be partitioned
  into relatively independent tasks with minimal
  intertask communication needs.
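
The criteria above can be condensed into a simple checklist, sketched here as a function. The function and parameter names are hypothetical, and the rule that either "avoid" condition overrides the "consider" conditions is an interpretation of the two lists rather than something the text states explicitly.

```python
def consider_multiple_microprocessors(
    diversified_real_time_load,
    extensive_io,
    high_reliability_without_redundancy,
    only_raw_power_needed,
    partitionable,
):
    """Return True if the application matches the criteria listed above."""
    # Either "avoid" condition rules the approach out (assumed precedence).
    if only_raw_power_needed or not partitionable:
        return False
    # Otherwise any one of the three application characteristics suffices.
    return bool(
        diversified_real_time_load
        or extensive_io
        or high_reliability_without_redundancy
    )


# Example: an I/O-heavy controller whose work splits into independent tasks.
suitable = consider_multiple_microprocessors(
    diversified_real_time_load=False,
    extensive_io=True,
    high_reliability_without_redundancy=False,
    only_raw_power_needed=False,
    partitionable=True,
)
```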


Distributed microprocessor system architecture—an 
example. Up to now, we have discussed multiple micro- 
processor systems in general terms. We now present a 
specific distributed architecture developed to exploit the 
advantages of concurrent processing while maintaining a 
simple, reliable system. 

The task-driven multi-microcomputer system37 consists
of a hierarchical controller that supervises a number
of heterogeneous processors, each having private program
memory, read/write data memory, and some I/O
capability. The system may also have global resources
that include a global data memory used as a message 
center and for storage of common variables. The con- 
troller, called the task allocation and arbitration unit, ac- 
cepts requests from internal and external sources and in- 
itiates the proper control actions. The required task syn- 
chronization and coordination are done via asyn- 
chronous handshaking signals through a control/hand- 
shake bus. A simplified block diagram of a task-driven 
multi-microcomputer system is shown in Figure 5. 

[Figure 5. Simplified block diagram of a task-driven multi-microcomputer system. Labeled elements: external and internal requests, task allocation and arbitration unit, control/handshake bus, data bus, processors with local memory, and mailbox memory.]

The various tasks available to the system are permanently
stored in the local program memories of the individual
processors. When a task is to be executed, it is
awakened by specifying its starting address rather than
downloaded from a central memory. For reliability and to
maintain system performance at a specified level, each
task is stored in two or more local program memories.
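
The dispatch behavior just described can be modeled with a minimal sketch. It assumes a controller that wakes a task by starting address on the first healthy, idle processor holding a copy; the class names, method names, and addresses are hypothetical and are not taken from the actual design.

```python
class Processor:
    """A processor with its tasks resident in local program memory."""

    def __init__(self, name, task_table):
        self.name = name
        self.task_table = dict(task_table)  # task name -> starting address
        self.busy = False
        self.failed = False


class TaskAllocationUnit:
    """Hypothetical model of the task allocation and arbitration unit."""

    def __init__(self, processors):
        self.processors = list(processors)

    def dispatch(self, task):
        """Wake `task` on the first healthy, idle processor that stores it.

        Returns (processor name, starting address), or None if no copy
        of the task is currently available.
        """
        for processor in self.processors:
            if (task in processor.task_table
                    and not processor.busy
                    and not processor.failed):
                processor.busy = True
                return processor.name, processor.task_table[task]
        return None

    def release(self, name):
        """Mark the named processor idle again when its task completes."""
        for processor in self.processors:
            if processor.name == name:
                processor.busy = False


# The task "filter" is stored in two local program memories.
p1 = Processor("P1", {"filter": 0x0100})
p2 = Processor("P2", {"filter": 0x0400, "log": 0x0800})
controller = TaskAllocationUnit([p1, p2])
first = controller.dispatch("filter")
second = controller.dispatch("filter")
```

Because the task is duplicated, a second request (or a request made after one processor fails) is served by the other copy instead of being lost.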

A dedicated controller "frees" the processors from
performing anything but their assigned tasks. This 
reduces individual processor complexity, which in turn 
results in a modular, more reliable unit. By designing the 
controller to be a highly reliable unit, complete system 
reliability is enhanced. The design of a general-purpose 
executive for a task-driven system was outlined in 
another work.38


The trend toward distributed microprocessor systems
described in this article is being recognized by microprocessor
manufacturers. They are now producing chips
that simplify interaction and coordination by providing
additional control signals as well as real-time, multitasking
operating system primitives. However, this is only a
first step. To develop effective distributed microprocessor
systems, additional work is needed in the areas of
hardware architecture, system executives, and communication
facilities.


March 1983 


References 


1. P. M. Russo, "Interprocessor Communication for Multi-Microcomputer Systems," Computer, Vol. 10, No. 4, Apr. 1977, pp. 67-76.

2. C. V. Ramamoorthy et al., "Hardware Software Issues in Multimicroprocessor Computer Architectures," Proc. First Annual Rocky Mountain Symp. Microcomputers: Systems, Software, Architecture, Fort Collins, Colo., Aug./Sept. 1977, pp. 73-99.

3. G. A. Anderson and E. D. Jensen, "Computer Interconnection Structures: Taxonomy, Characteristics, and Examples," Computing Surveys, Vol. 7, No. 4, Dec. 1975, pp. 197-214.

4. S. H. Fuller et al., "Multi-Microprocessors: An Overview and Working Example," Proc. IEEE, Vol. 66, No. 2, Feb. 1978, pp. 216-218.

5. M. Schwartz, Computer Communication Network Design and Analysis, Prentice-Hall, Englewood Cliffs, N. J., 1977.

6. B. A. Bowen and R. J. A. Buhr, The Logical Design of Multiple Microprocessor Systems, Prentice-Hall, Englewood Cliffs, N. J., 1980.

7. P. H. Enslow, Jr., "Multiprocessor Organization—A Survey," Computing Surveys, Vol. 9, No. 1, Mar. 1977, pp. 103-129.

8. M. Satyanarayanan, "Commercial Multiprocessing Systems," Computer, Vol. 13, No. 5, May 1980, pp. 75-96.

9. L. C. Widdoes, Jr., "The Minerva Multi-Microprocessor," Proc. Third Ann. Symp. Computer Architecture, Clearwater, Fla., Jan. 1976, pp. 34-39.

10. L. H. Anderson, "The Microcomputer as Distributed Intelligence," Proc. Int'l Symp. Circuits and Systems, Boston, Mass., Apr. 1975, pp. 337-340.

11. W. L. Spetz, "Microprocessor Networks," Computer, Vol. 10, No. 7, July 1977, pp. 64-70.

12. M. J. Flynn, "Very High Speed Computing Systems," Proc. IEEE, Vol. 54, No. 12, Dec. 1966, pp. 1901-1909.

13. J. L. Baer, "Multiprocessing Systems," IEEE Trans. Computers, Vol. C-25, No. 12, Dec. 1976, pp. 1271-1277.

14. L. C. Higbie, "Super Computer Architecture," Computer, Vol. 6, No. 12, Dec. 1973, pp. 48-58.

15. D. Prener, "Large Multimicroprocessor Systems," Microprocessors and Microsystems, Vol. 3, No. 6, July/Aug. 1979, pp. 271-276.

16. M. Krieger and E. T. Fathi, "Design Aspects of a Simple Distributed Microprocessor System," Int'l Conf. Communication, Circuits, and Systems, Jadavpur University, Calcutta, Dec. 1981.

17. A. Baum and D. Senzig, "Hardware Considerations in a Microcomputer Multiprocessing System," Digest of Papers Compcon Spring 75, San Francisco, Calif., Feb. 1975, pp. 27-30.

18. C. J. Jenny, "Process Partitioning in Distributed Systems," IEEE NTC Conf. Record, 1977, Vol. 2, pp. 31:1-1 to 31:1-10.

19. C. G. Davis and C. R. Vick, "The Software Development System," IEEE Trans. Software Engineering, Vol. SE-3, No. 1, Jan. 1977, pp. 69-84.

20. D. P. Siewiorek, "Multiprocessors: Reliability, Modeling, and Graceful Degradation," System Reliability and Integrity, State-of-the-Art Report, Infotech, Ltd., Maidenhead, England.

21. C. Weitzman, Distributed Micro/Minicomputer Systems, Structure, Implementation, and Applications, Prentice-Hall, Englewood Cliffs, N. J., 1980.

22. D. A. Rennels et al., "Distributed Fault-Tolerant Computer Systems," Computer, Vol. 13, No. 3, Mar. 1980, pp. 55-65.

23. D. Popovic and D. Danziger, "Total Life Calculations With and Without Maintenance," Microprocessors and Microsystems, Vol. 3, No. 6, July/Aug. 1979, pp. 257-261.

24. B. P. Buckles and D. M. Hardin, "Partitioning and Allocation of Logical Resources in a Distributed Computing Environment," Tutorial: Distributed System Design, IEEE Computer Society, 1979, pp. 247-276.

25. J. T. Lawson and M. P. Mariani, "Distributed Data Processing System Design—A Look at the Partitioning Problem," Proc. Compsac 78, Chicago, Ill., pp. 358-363.

26. E. D. Jensen and W. E. Boebert, "Partitioning and Assignment of Distributed Processing Software," Digest of Papers Compcon Fall 77, pp. 348-352.

27. Y. P. Chien, "Multitasking Executive Simplifies Realtime Microprocessor System Design," Computer Design, Jan. 1979, pp. 109-117.

28. P. Brinch-Hansen, "A Keynote Address on Concurrent Programming," Computer, Vol. 12, No. 5, May 1979, pp. 50-56.

29. K. J. Thurber et al., "A Systematic Approach to the Design of Digital Bussing Structures," AFIPS Conf. Proc., Vol. 41-II, 1972 FJCC, pp. 719-740.

30. B. C. Searle and D. E. Freberg, "Microprocessors," Computer, Vol. 8, No. 10, Oct. 1975, pp. 75-83.

31. E. T. Fathi, "Task-Driven Multi-Microcomputer System," master's thesis, University of Ottawa, 1981.

32. P. Brinch-Hansen, "Distributed Processes—A Concurrent Programming Concept," Comm. ACM, Vol. 21, No. 11, Nov. 1978, pp. 934-941.

33. K. C. Kahn, "A Small-Scale Operating System Foundation for Microprocessor Applications," Proc. IEEE, Vol. 66, No. 2, Feb. 1978, pp. 209-216.

34. D. A. Townsen, "A Task Scheduling Executive Program for Microcomputer Systems," Computer Design, Vol. 16, No. 6, June 1977, pp. 194-202.

35. F. V. D. Linden and I. Wilson, "Real-Time Executive for Microprocessors," Microprocessors and Microsystems, Vol. 4, No. 6, July/Aug. 1980, pp. 211-218.

36. C. J. Tavora, "A Basic Technique for Real-Time System Design," Computer Design, Vol. 19, No. 10, Oct. 1980, pp. 147-152.

37. M. Krieger, "Task-Driven Multi-Microprocessor System," Proc. First Canadian Workshop Design and Development of Computer Systems, May 1979, pp. 81-88.

38. E. T. Fathi and M. Krieger, "Executive for Task-Driven Multi-Microcomputer Systems," accepted for publication in IEEE Micro.


Eli T. Fathi is vice-president for engineering at
C. V. W. Armstrong Consultants, Ltd., and is currently
designing a multiple-microprocessor system for radar
signal processing. Previously, he worked with the Royal
Canadian Mounted Police and Miller Communication
Systems, Ltd., where he was involved in the development
of various microprocessor-based systems.

Fathi received his BASc and MASc degrees in electrical
engineering from the University of Ottawa, Ontario,
Canada, in 1978 and 1981, respectively. He is a member
of the IEEE and the Association of Professional Engineers
of Ontario.


Moshe Krieger, an associate professor in the Department
of Electrical Engineering at the University of Ottawa, is
currently a visiting professor in the Electrical and
Computer Engineering Department of Syracuse University.
His research interests are in distributed systems,
special-purpose microprocessor systems, switching
circuits, and reliability. He authored the book Basic
Switching Circuit Theory, published by Macmillan in 1967.

Krieger received a BSc degree in electrical engineering from
Technion Israel Institute of Technology in 1959, an MASc
degree in electrical engineering from the University of Toronto
in 1961, and a PhD degree in electrical engineering from
Syracuse University in 1967.


COMPUTER 


